Exploring Surface-Level Heuristics for Negation and Speculation Discovery in Clinical Texts

Authors

  • Emilia Apostolova
  • Noriko Tomuro
Abstract

We investigate the automatic identification of negated and speculative statements in biomedical texts, focusing on the clinical domain. Our goal is to evaluate the performance of simple, regex-based algorithms, which offer low computational cost and simple implementation and do not rely on the accurate computation of deep linguistic features of idiosyncratic clinical texts. The NegEx algorithm, extended with an additional set of regex-based rules, shows promising results when evaluated on the BioScope corpus. Current and future work focuses on a bootstrapping algorithm for the discovery of new rules from unannotated clinical texts.
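To make the idea of a surface-level, regex-based approach concrete, the following is a minimal sketch in the spirit of NegEx-style trigger matching. It is not the rule set evaluated in the paper: the trigger lists, the five-token scope window, and the function names (`classify`, `scope_after`) are all assumptions chosen for illustration, and features such as termination terms or triggers that follow the target are left out.

```python
import re

# Hypothetical, tiny trigger lists for illustration only. They are NOT the
# NegEx lexicon nor the additional rules described in the abstract.
NEGATION_TRIGGERS = ["no", "not", "without", "denies", "no evidence of"]
SPECULATION_TRIGGERS = ["may", "might", "possible", "suggests", "cannot rule out"]

WINDOW = 5  # assumed scope: up to 5 word tokens following a trigger


def _compile(triggers):
    # Put longer phrases first so "no evidence of" is preferred over "no".
    alternatives = sorted(triggers, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(t) for t in alternatives) + r")\b"
    return re.compile(pattern, re.IGNORECASE)


NEG_RE = _compile(NEGATION_TRIGGERS)
SPEC_RE = _compile(SPECULATION_TRIGGERS)


def scope_after(sentence, match, window=WINDOW):
    """Return the lowercased word tokens in the window after a trigger match."""
    rest = sentence[match.end():]
    return [tok.lower() for tok in re.findall(r"\w+", rest)[:window]]


def classify(sentence, term):
    """Label `term` as 'negated', 'speculated', or 'affirmed' in `sentence`.

    A term is labeled when it appears, as a contiguous token sequence,
    inside the window following a trigger. Negation is checked first.
    """
    term_tokens = term.lower().split()
    n = len(term_tokens)
    for label, regex in (("negated", NEG_RE), ("speculated", SPEC_RE)):
        for match in regex.finditer(sentence):
            scope = scope_after(sentence, match)
            if any(scope[i:i + n] == term_tokens
                   for i in range(len(scope) - n + 1)):
                return label
    return "affirmed"


if __name__ == "__main__":
    for text, term in [
        ("No evidence of pneumonia.", "pneumonia"),
        ("Findings may represent atelectasis.", "atelectasis"),
        ("The patient has pneumonia.", "pneumonia"),
    ]:
        print(text, "->", classify(text, term))
```

Because the rules are plain regular expressions over the raw sentence, a sketch like this needs no parser or part-of-speech tagger, which is the kind of low-cost property the abstract highlights. Its accuracy depends entirely on the trigger lists and the window size chosen.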

Similar resources

Negation and Speculation Target Identification

Negation and speculation are common in natural language text. Many applications, such as biomedical text mining and clinical information extraction, seek to distinguish positive/factual objects from negative/speculative ones (i.e., to determine what is negated or speculated) in biomedical texts. This paper proposes a novel task, called negation and speculation target identification, to identify...

A review corpus annotated for negation, speculation and their scope

This paper presents a freely available resource for research on handling negation and speculation in review texts. The SFU Review Corpus, consisting of 400 documents of movie, book, and consumer product reviews, was annotated at the token level with negative and speculative keywords and at the sentence level with their linguistic scope. We report statistics on corpus size and the consistency of...

Effective Bio-Event Extraction Using Trigger Words and Syntactic Dependencies

The scientific literature is the main source for comprehensive, up-to-date biological knowledge. Automatic extraction of this knowledge facilitates core biological tasks, such as database curation and knowledge discovery. We present here a linguistically inspired, rule-based and syntax-driven methodology for biological event extraction. We rely on a dictionary of trigger words to detect and cha...

Detecting Negated and Uncertain Information in Biomedical and Review Texts

The thesis proposed here aims to assist Natural Language Processing tasks through negation and speculation detection. We focus on the biomedical and review domains, in which it has been shown that handling these language forms helps to improve the performance of the main task. In the biomedical domain, the existence of a corpus annotated for negation, speculation and their ...

Prioritize the ordering of URL queue in Focused crawler

The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler, downloading domain-specific web pages is not a simple task. An unfocused approach often yields undesired results. Therefore, several new ideas have been proposed; a key technique among them is focused crawling, which is able to crawl particular topical...

Publication date: 2010